Chinese Named Entity Recognition with Conditional Probabilistic Models

Authors

  • Aitao Chen
  • Fuchun Peng
  • Roy Shan
  • Gordon Sun
Abstract

This paper describes the work on Chinese named entity recognition performed by the Yahoo team at the third International Chinese Language Processing Bakeoff. We used two kinds of conditional probabilistic models for this task: conditional random fields (CRFs) and maximum entropy models. In particular, we trained two conditional random field recognizers and one maximum entropy recognizer to identify names of people, places, and organizations in unsegmented Chinese text. Our best performance is an F-score of 86.2% on the MSRA dataset and 88.53% on the CITYU dataset.
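
The abstract does not say how the recognizers encode their input or output. As an illustration only, the sketch below shows one common way to frame entity recognition in unsegmented text for sequence models such as CRFs or maximum entropy taggers: assign one BIO tag to each character and describe each character by a small context window. The label names (PER, LOC, ORG), the helper functions `spans_to_bio` and `char_features`, the feature templates, and the example sentence are all assumptions made for this sketch, not details taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical span format: (start, end_exclusive, entity_type).
# Hypothetical label set: PER (person), LOC (place), ORG (organization).
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Assign one BIO tag per character; characters outside any span get 'O'."""
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags


def char_features(text: str, i: int, window: int = 1) -> Dict[str, str]:
    """Illustrative features: character unigrams in a window, plus one bigram."""
    feats: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        feats[f"c[{offset}]"] = text[j] if 0 <= j < len(text) else "<PAD>"
    # Bigram made of the current character and the one after it.
    nxt = text[i + 1] if i + 1 < len(text) else "<PAD>"
    feats["c[0]c[1]"] = text[i] + nxt
    return feats


# Made-up example sentence: "Zhang San works in Beijing".
text = "张三在北京工作"
spans = [(0, 2, "PER"), (3, 5, "LOC")]
tags = spans_to_bio(text, spans)
features = [char_features(text, i) for i in range(len(text))]
```

The resulting per-character feature dictionaries and tags are the kind of paired sequences a CRF or maximum entropy toolkit could be trained on; no segmentation into words is needed beforehand.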

Similar resources

A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three kinds of methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well in the Persian language if they are trained with good features. To get good performanc...

A Framework Based on Graphical Models with Logic for Chinese Named Entity Recognition

Chinese named entity recognition (NER) has recently been viewed as a classification or sequence labeling problem, and many approaches have been proposed. However, they tend to address this problem without considering linguistic information in Chinese NEs. We propose a new framework based on probabilistic graphical models with first-order logic for Chinese NER. First, we use Conditional Random Fi...

Two Step Chinese Named Entity Recognition Based on Conditional Random Fields Models

This paper describes NER@ISCAS, a Chinese named entity recognition (NER) system for the MSRA NER open track. Built on the Conditional Random Fields (CRFs) model, it integrates text, part-of-speech, and small-vocabulary-character-list features with heuristic post-processing rules.

Robust Algorithms for Semantic Class Labeling in Chinese Query Understanding

In this paper we propose an approach to handle word variation induced by automatic speech recognition (ASR) and by keyboard input errors in human-computer interaction systems. Considering the characteristics of Chinese, fuzzy matching based on Chinese pinyin is used to correct the semantic concepts in a natural language query. The approach has two stages: first, conditional random field (CR...

Tree Representations in Probabilistic Models for Extended Named Entities Detection

In this paper we deal with Named Entity Recognition (NER) on transcriptions of French broadcast data. Two aspects make the task more difficult than previous NER tasks: i) the named entities annotated in this work have a tree structure, so the task cannot be tackled as a sequence labelling task; ii) the data used are noisier than the data used for previous NER tasks. We approach the...

Publication date: 2006